Method for data retention in a data cache and data storage system

ABSTRACT

A method for data retention in a data cache and a data storage system are provided. The data storage system (100) includes a storage controller (102) with a cache (103) and a data storage means (106). The cache (103) has a first least recently used list (104) for referencing dirty data which is stored in the cache (103), and a second least recently used list (105) for clean data in the cache (103). Dirty data is destaged from the cache (103) when it reaches the tail of the first least recently used list (104) and clean data is purged from the cache (103) when it reaches the tail of the second least recently used list (105).

FIELD OF THE INVENTION

[0001] This invention relates to data storage systems. In particular, this invention relates to a method and system for data retention in a data cache.

BACKGROUND OF THE INVENTION

[0002] In existing, well-known write caching systems, data is transferred from a host into a cache on a storage controller. The data is retained temporarily in the cache until it is subsequently written (“destaged”) to a disk drive or RAID array.

[0003] In order to select the region of data to destage next, the controller firmware uses an LRU (Least Recently Used) algorithm. The use of an LRU algorithm increases the probability of the following advantageous events happening to the data in the cache.

[0004] 1. Data in the cache may be overwritten with updated data before being destaged, so that write operations from the host result in only one destage operation to the disk, thereby reducing disk utilisation.

[0005] 2. Data in the cache may be combined with logically-adjacent data (coalesced) to form a complete stride for destaging to a RAID 5 array, thereby avoiding the read-modify-write penalty typically encountered when writing to a RAID 5 array.

[0006] 3. An attempt by the host to read data which it has recently written may be serviced from the cache without the overhead of retrieving the required data from the disk. This improves the read response time.

[0007] Data in the cache must be protected against loss during unplanned events (e.g. resets or power outages). This is typically achieved by including battery backed memory or UPS (uninterruptible power supply) to allow the data to be retained during such events.

[0008] However, the provision of such backup power is difficult and expensive so a design decision is often taken such that the controller may not have sufficient power available to retain the contents of all of its cache memory. Consequently, the controller has areas of cache memory which cannot be used for write caching (since the data stored therein would be vulnerable to loss).

[0009] Such areas of the cache may, however, be used as a read cache (since this data does not need to be written to the storage device). Such a read cache would be used independently of the write cache.

[0010] It is an aim of the present invention to provide a data cache in which the write and read areas of the cache are not separated and the same area of memory can function as either write cache or read cache. Cached read data and cached write data are handled in the same contiguous areas of memory.

[0011] When a write is received from the host and data is transferred into the cache, it is then known as “dirty” data. Sometime later it is destaged to the disk but may be retained in the cache. It is then known as “clean” data.

[0012] If a read command is received from the host for the region of memory corresponding to the cached data then the read command may be satisfied from the clean data in the cache, or a combination of contiguous clean and dirty spans of data.

[0013] The clean data in the cache needs to be discarded at some point to allow higher-priority clean data to be retained. The problem is selecting the next clean data entry to discard. This process is known as purging.

DISCLOSURE OF THE INVENTION

[0014] According to a first aspect of the present invention there is provided a method for data retention in a data cache, comprising: referencing dirty data stored in a cache in a first least recently used list; and referencing clean data in the cache in a second least recently used list; wherein dirty data is destaged from the cache when it reaches the tail of the first least recently used list and clean data is purged from the cache when it reaches the tail of the second least recently used list.

[0015] Dirty data which is destaged to a data storage means may have a copy of the data retained in the cache as clean data which is deleted from the first list and added to the second list.

[0016] A read command which is a cache miss may fetch data from a data storage means and the data may be retained in the cache with a reference in the second list.

[0017] The method may include keeping a flag with each data reference in the first list indicating whether or not the data has been read whilst on the first list. If the data was read when referenced in the first list, the data may be added to the head of the second list when the data is destaged. If the data was not read when referenced in the first list, the data may be either maintained in its position in the second list or discarded.

[0018] The flag may include a timestamp each time the data is read and the timestamp may be used to prioritise the position of the data reference in the second list.

[0019] Data may be partly dirty and partly clean and may be referenced in both the first and second lists.

[0020] According to a second aspect of the present invention there is provided a data storage system comprising: a storage controller including a cache; a data storage means; and the cache has a first least recently used list for referencing dirty data which is stored in the cache, and a second least recently used list for referencing clean data; wherein dirty data is destaged from the cache when it reaches the tail of the first least recently used list and clean data is purged from the cache when it reaches the tail of the second least recently used list.

[0021] Dirty data which is destaged to a data storage means may have a copy of the data retained in the cache as clean data which is deleted from the first list and added to the second list.

[0022] A read command which is a cache miss may fetch data from the data storage means and the data may be retained in the cache with a reference in the second list.

[0023] A flag may be provided with each data reference in the first list indicating whether or not the data has been read whilst on the first list. If the data was read when referenced in the first list, the data may be added to the head of the second list when the data is destaged. If the data was not read when referenced in the first list, the data may be either maintained in its position in the second list or discarded.

[0024] The flag may include a timestamp each time the data is read and the timestamp may be used to prioritise the position of the data reference in the second list.

[0025] Data may be partly dirty and partly clean and may be referenced in both the first and second lists.

[0026] According to a third aspect of the present invention there is provided a computer program product stored on a computer readable storage medium, comprising computer readable program code means for retaining data in a data cache by performing the steps of: referencing dirty data stored in a cache in a first least recently used list; and referencing clean data in the cache in a second least recently used list; wherein dirty data is destaged from the cache when it reaches the tail of the first least recently used list and clean data is purged from the cache when it reaches the tail of the second least recently used list.

BRIEF DESCRIPTION OF THE DRAWINGS

[0027] An embodiment of the present invention will now be described, by means of an example only, with reference to the accompanying drawings, in which:

[0028] FIG. 1 is a block diagram of a data storage system in accordance with the present invention;

[0029] FIG. 2 is a block diagram of part of the data storage system of FIG. 1; and

[0030] FIG. 3 is a flow diagram of a method in accordance with the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0031] Referring to FIG. 1, a data storage system 100 is shown. The data storage system 100 of the figure is a simple system with a single host computer 101. Multiple host computers may be provided sharing common storage means.

[0032] A storage controller 102 controls the storage in a data storage means 106 which may be any storage medium including a disk drive or an array of disk drives 106, for example, a RAID array of disk drives could be used. The storage controller 102 has a cache 103 in which data is temporarily retained until it is subsequently destaged to the data storage means 106.

[0033] Data regions are stored in the cache 103. The storage controller 102 uses an algorithm to determine which region of data in the cache 103 to destage next.

[0034] The algorithm uses two lists 104, 105 both of which are LRU (Least Recently Used) lists. The lists contain entries referencing the data regions stored in the cache 103. The entries are data region descriptors.

[0035] A data region is an arbitrary unit of data which may be referred to as a track. In an example implementation, a track is 64 k bytes. A data descriptor on the LRW or LRR lists 104, 105 represents a track and each track is represented on each list 104, 105 exactly 0 or 1 time. A track may have subsets referred to as pages. In an example implementation, a page is 4 k bytes giving 16 pages in a track. Each of the pages in a track may be dirty, clean or absent. In practice, there may also be subsets of pages.
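The track and page layout of the example implementation might be modelled as in the following minimal sketch. The names `Track`, `PageState` and `page_index` are hypothetical and are introduced only for illustration; only the sizes (64 k byte track, 4 k byte page) and the three page states come from the description above.

```python
from enum import Enum

TRACK_BYTES = 64 * 1024                      # example track size from the text
PAGE_BYTES = 4 * 1024                        # example page size from the text
PAGES_PER_TRACK = TRACK_BYTES // PAGE_BYTES  # 16 pages per track


class PageState(Enum):
    ABSENT = 0
    CLEAN = 1
    DIRTY = 2


class Track:
    """Hypothetical descriptor for one track (illustrative names only)."""

    def __init__(self):
        # Every page starts out absent from the cache.
        self.pages = [PageState.ABSENT] * PAGES_PER_TRACK

    def has_dirty(self):
        return PageState.DIRTY in self.pages

    def has_clean(self):
        return PageState.CLEAN in self.pages


def page_index(byte_offset):
    """Map a byte offset within a track to the index of its page."""
    if not 0 <= byte_offset < TRACK_BYTES:
        raise ValueError("offset outside track")
    return byte_offset // PAGE_BYTES
```

A track holding at least one dirty page would be a destage candidate, and a track holding at least one clean page would be a purge candidate, which is why the two predicates are kept separate.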

[0036] The first list 104 is for dirty data which is data that has been received from the host 101. The first list 104 is referred to as the LRW (Least Recently Written) list. The second list 105 is for clean data which is data which has been destaged to the data storage means 106 and a copy is retained in the cache 103. The second list 105 is referred to as the LRR (Least Recently Read) list.

[0037] Referring to FIG. 2, a detail of FIG. 1 is provided showing the cache 103 with the LRW list 104 and the LRR list 105. A data region in the cache 103 will always be on at least one list 104, 105 and may be on both lists.

[0038] When the dirty data is initially stored 200 in the cache 103, a corresponding entry 201 is created for it on the dirty LRW list 104. When the data is destaged and marked clean, it is deleted from the LRW list 104 and added 202 to the LRR list 105.
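The basic movement of an entry between the two lists can be sketched as follows. This is an illustration under stated assumptions, not the controller firmware: the class name `TwoListCache` is hypothetical, each list is modelled as a Python `OrderedDict` whose last item is the head (most recent) and whose first item is the tail, and the actual write to the data storage means is omitted.

```python
from collections import OrderedDict


class TwoListCache:
    """Illustrative two-list cache; keys are region identifiers."""

    def __init__(self):
        self.lrw = OrderedDict()  # dirty regions, tail first
        self.lrr = OrderedDict()  # clean regions, tail first

    def write(self, region):
        # A host write creates (or refreshes) an entry at the head of the LRW list.
        self.lrw[region] = True
        self.lrw.move_to_end(region)

    def destage_tail(self):
        """Destage the least recently written region and retain it as clean."""
        if not self.lrw:
            return None
        region, _ = self.lrw.popitem(last=False)  # take the tail of the LRW list
        # (The write to the data storage means would happen here.)
        if region not in self.lrr:
            self.lrr[region] = True               # new entry lands at the head
        return region
```

In this simplified sketch every destaged region is retained on the LRR list; it does not yet decide how much priority such a region deserves.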

[0039] Additionally, a data region may be partly dirty and partly clean. As described above, a data region in the form of a track may have some dirty pages and some clean pages. In this case the track would be on both lists 104, 105, since it must be possible to find it both when searching for a destage candidate and when searching for a purge candidate. Individual pages can be destaged or purged, rather than doing this at track level.

[0040] There is also another route onto the LRR list 105. In a general read/write cache 103, there are read commands from the host 101 which are cache misses. In this case, data is fetched from the data storage means 106 and may be retained in the cache 103 to satisfy further read commands from the host 101. A corresponding entry 203 is made for the data on the LRR list 105.
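The read-miss route onto the LRR list might look like the sketch below. The function name `read`, the `backing_store` dictionary standing in for the data storage means, and the `capacity` limit that triggers purging are all assumptions made for the example; the description does not specify when purging is triggered.

```python
from collections import OrderedDict


def read(region, lrr, backing_store, capacity):
    """Return the data for a region, caching it on the LRR list on a miss.

    lrr is an OrderedDict whose last item is the head (most recently read).
    """
    if region in lrr:
        lrr.move_to_end(region)      # read hit: move the entry to the head
        return lrr[region]
    data = backing_store[region]     # read miss: fetch from storage
    lrr[region] = data               # retain as clean data at the head
    while len(lrr) > capacity:
        lrr.popitem(last=False)      # purge from the tail of the LRR list
    return data
```

For example, with a capacity of two regions, reading x, y, x and then z purges y, because y is at the tail of the list when z is fetched.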

[0041] This is particularly beneficial in an environment where the storage controller 102 may be accessed from multiple hosts, since multiple hosts often utilise some regions of the disks for storing shared data and consequently multiple hosts may read the same disk region frequently.

[0042] There is a problem of how to assign suitable priority to data which was dirty but has been destaged so is now marked as clean. This data region needs to be deleted from the LRW list and, potentially, added to the LRR list, if it is not there already. This data was created in cache some while ago so to add it to the “recent” end of the LRR list would be giving it excessive priority. Conversely, to add it to the “stale” end of the LRR list would unduly depress its priority and would be useless: it would be the next candidate for purging so, in a busy system, would be immediately discarded. Adding it to the middle of the LRR list would be arbitrary. This would also be potentially difficult as the middle point of the LRR list is not tracked.

[0043] In order to overcome this problem a flag is kept with each data region descriptor in the lists 104, 105, indicating whether or not the data region was ever read while it contained dirty data.

[0044] If a data region was read while dirty, then it is likely that another host will also read the same data region in the near future. Therefore, the data region is added to the head of the LRR list 105, if it is not already in the LRR list 105.

[0045] If the data region was not read while dirty then it is less likely that it will be read in the near future so it is not moved on the LRR list 105. If the data region is already in the LRR list 105 then its position in the list is unchanged. If the data region is not in the LRR list 105 then the data region is discarded.

[0046] A further enhancement to the use of the “has been read” flag is to timestamp the region of cached data each time it is read. Using this approach, if a data region was read a long time ago it can be treated as having lower read-retention priority so the decision can be made not to add it to the LRR list 105.
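One possible form of the timestamp test is sketched below. The description does not say how old a read must be before it is disregarded, so the `max_age` threshold, like the function name `retain_after_destage`, is an assumption introduced for the example.

```python
def retain_after_destage(last_read, now, max_age):
    """Decide whether a destaged region should be added to the LRR list.

    last_read is the time of the most recent read while the region was
    dirty, or None if it was never read; max_age is a hypothetical cut-off.
    """
    if last_read is None:
        return False                      # never read while dirty
    return (now - last_read) <= max_age   # only a recent read earns retention
```

With `max_age` set to 60 time units, a region last read 10 units before destaging would be retained, while one last read 500 units before would not.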

[0047] Referring to FIG. 3, a flow diagram shows a method of referencing data regions in the lists 104, 105. A data write is first received 301 in the cache. A data descriptor for the data is input 302 at the head of the LRW list as dirty data. A flag is kept 303 with the data descriptor indicating if the data is read. The data descriptor moves down the LRW list and, when it reaches 304 the tail of the LRW list, it is destaged to data storage means.

[0048] It is then determined 305 if the data descriptor is already in the LRR list. If it is already in the LRR list, the data descriptor is left 306 where it is in the LRR list.

[0049] If the data descriptor is not already in the LRR list, it is then determined 307 if the data has been read whilst it was dirty. If the data has been read whilst dirty, the data descriptor is sent 308 to the head of the LRR list. If the data has not been read whilst dirty, the data is discarded.
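The decision made after destaging in FIG. 3 (steps 305 to 308) can be expressed as a short sketch. The function name `place_after_destage` and the returned strings are hypothetical, and the LRR list is again modelled as an `OrderedDict` whose last item is the head.

```python
from collections import OrderedDict


def place_after_destage(region, lrr, read_while_dirty):
    """Apply the post-destage decision to one region and report the outcome."""
    if region in lrr:                 # step 305: already on the LRR list?
        return "left in place"        # step 306: position unchanged
    if read_while_dirty:              # step 307: was it read while dirty?
        lrr[region] = True            # step 308: insert at the head
        return "added to head"
    return "discarded"                # not read while dirty: not retained
```

Note that a region already on the LRR list keeps its position whether or not the flag is set, matching the order of the tests in the flow diagram.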

[0050] The following is a detailed description of the described method. The following should be noted.

[0051] Virtual Track (VT) is the jargon used for a data region in the cache, which contains some dirty data, some clean data or both.

[0052] Cache directory (CD) is the jargon used for the overall directory of cache elements.

[0053] To be considered for a read or write hit, or for destaging or purging, a VT must be in the CD.

[0054] Two queues are maintained:

[0055] LRW queue of VTs with ANY pages containing some dirty data.

[0056] LRR queue of VTs with ANY pages containing no dirty data.

[0057] General Rules:

[0058] VTs get added/moved to the head of the LRW queue whenever they are populated with one or more dirty sectors.

[0059] VTs get added/moved to the head of the LRR queue whenever they are read and contain a clean page.

[0060] VTs which get read have their “read” flag set.

[0061] When a VT which is not already on the LRR queue is destaged and marked clean, it is added to the head of the LRR queue if the “read” flag is set. Otherwise it is deleted.

[0062] Rules in detail:

[0063] Dirty VT inserted into CD:

[0064] The VT is added to the head of the LRW queue.

[0065] Clean VT inserted into CD:

[0066] The VT is added to the head of the LRR queue.

[0067] Dirty data merged into VT in LRW queue:

[0068] The VT is moved to the head of the LRW queue.

[0069] Dirty data merged into VT in LRR queue:

[0070] The VT is added to the head of the LRW queue.

[0071] VT remains in LRR queue if it retains any clean pages.

[0072] Dirty data merged into VT in both queues:

[0073] The VT is moved to the head of the LRW queue.

[0074] VT remains in LRR queue if it retains any clean pages.

[0075] Clean data merged into VT in LRW queue:

[0076] VT is left at current location in LRW queue.

[0077] VT is added to the head of the LRR queue if it now contains any clean pages.

[0078] “Read” flag is set.

[0079] Clean data merged into VT in LRR queue:

[0080] VT is moved to the head of the LRR queue.

[0081] Clean data merged into VT in both queues:

[0082] VT is left at current location in LRW queue.

[0083] VT is moved to the head of the LRR queue.

[0084] “Read” flag is cleared.

[0085] Last clean page purged from VT at end of LRR queue:

[0086] VT is deleted from end of LRR queue.

[0087] Dirty span removed by invalidation:

[0088] If VT no longer contains any dirty data it is deleted from LRW queue.

[0089] Clean span removed by invalidation:

[0090] If VT no longer contains any clean pages it is deleted from LRR queue.

[0091] Mixed span removed by invalidation:

[0092] If VT no longer contains any clean pages it is deleted from LRR queue.

[0093] If VT no longer contains any dirty data it is deleted from LRW queue.
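The queue-membership rules above for merging and invalidation might be captured in a sketch such as the following. All names (`Directory`, `merge_dirty`, `merge_clean`, `invalidate`) are hypothetical. Two assumptions are made that the rules do not state: pages are tracked as sets of page numbers per VT, and clean data merged into a VT never replaces a page that is currently dirty. Handling of the “read” flag is deliberately left out of this sketch.

```python
from collections import OrderedDict


class Directory:
    """Illustrative cache directory; the last item of each queue is its head."""

    def __init__(self):
        self.lrw = OrderedDict()  # VTs with any dirty page
        self.lrr = OrderedDict()  # VTs with any clean page
        self.dirty = {}           # VT -> set of dirty page numbers
        self.clean = {}           # VT -> set of clean page numbers

    def _sync(self, vt):
        # Delete the VT from any queue it no longer qualifies for.
        if not self.dirty.get(vt):
            self.lrw.pop(vt, None)
        if not self.clean.get(vt):
            self.lrr.pop(vt, None)

    def merge_dirty(self, vt, pages):
        pages = set(pages)
        self.dirty.setdefault(vt, set()).update(pages)
        self.clean.setdefault(vt, set()).difference_update(pages)
        self.lrw[vt] = True
        self.lrw.move_to_end(vt)  # added or moved to the head of the LRW queue
        self._sync(vt)            # stays on the LRR queue only if clean pages remain

    def merge_clean(self, vt, pages):
        # Assumption: dirty pages are newer, so clean data does not replace them.
        pages = set(pages) - self.dirty.get(vt, set())
        self.clean.setdefault(vt, set()).update(pages)
        if self.clean[vt]:
            self.lrr[vt] = True
            self.lrr.move_to_end(vt)  # added or moved to the head of the LRR queue
        # The position in the LRW queue is left unchanged.

    def invalidate(self, vt, pages):
        pages = set(pages)
        self.dirty.get(vt, set()).difference_update(pages)
        self.clean.get(vt, set()).difference_update(pages)
        self._sync(vt)
```

Deriving queue membership from the per-page sets in one helper keeps the three invalidation cases (dirty, clean and mixed spans) in a single code path.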

[0094] The above method has the advantage that it identifies data to be preserved in the cache and data which need not be preserved and can be destaged to a data storage means.

[0095] Only dirty regions of the cache are protected against power failure. At runtime a table of the dirty pages is maintained so that, if power fails, the pages which need to be backed up can be identified.
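A runtime table of dirty pages could be as simple as the following sketch. The class name `DirtyPageTable` and its methods are hypothetical, and the description does not specify the form of the table; a set of (track, page) pairs is assumed here purely for illustration.

```python
class DirtyPageTable:
    """Illustrative table of the pages that must survive a power failure."""

    def __init__(self):
        self._dirty = set()  # (track, page) pairs currently holding dirty data

    def mark_dirty(self, track, page):
        self._dirty.add((track, page))

    def mark_clean(self, track, page):
        # Called once a page has been destaged; unknown pages are ignored.
        self._dirty.discard((track, page))

    def pages_to_back_up(self):
        """Return the pages that would need backing up if power failed now."""
        return sorted(self._dirty)
```

Clean pages never appear in the table, since a copy of their data already exists on the data storage means.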

[0096] The described method particularly improves write performance for RAID 5 storage arrays by permitting data coalescing into full-stride writes.

[0097] The described technology could be used in disk drives, disk controllers/adapters and file servers.

[0098] Modifications and improvements may be made to the foregoing without departing from the scope of the present invention.

What is claimed is:
 1. A method for data retention in a data cache, comprising: referencing dirty data stored in a cache in a first least recently used list; and referencing clean data in the cache in a second least recently used list; wherein dirty data is destaged from the cache when it reaches the tail of the first least recently used list and clean data is purged from the cache when it reaches the tail of the second least recently used list.
 2. A method as claimed in claim 1, wherein dirty data which is destaged to a data storage means and a copy of the data is retained in the cache as clean data is deleted from the first list and added to the second list.
 3. A method as claimed in claim 1, wherein a read command which is a cache miss fetches data from a data storage means and the data is retained in the cache with a reference in the second list.
 4. A method as claimed in claim 1, wherein the method includes keeping a flag with each data reference in the first list indicating whether or not the data has been read whilst on the first list.
 5. A method as claimed in claim 1, wherein, if the data was read when referenced in the first list, the data is added to the head of the second list when the data is destaged.
 6. A method as claimed in claim 1, wherein, if the data was not read when referenced in the first list, the data is either maintained in its position in the second list or discarded.
 7. A method as claimed in claim 4, wherein the flag includes a timestamp each time the data is read and the timestamp is used to prioritise the position of the data reference in the second list.
 8. A method as claimed in claim 1, wherein data is partly dirty and partly clean and is referenced in both the first and second lists.
 9. A data storage system comprising: a storage controller including a cache; a data storage means; and the cache has a first least recently used list for referencing dirty data which is stored in the cache, and a second least recently used list for referencing clean data; wherein dirty data is destaged from the cache when it reaches the tail of the first least recently used list and clean data is purged from the cache when it reaches the tail of the second least recently used list.
 10. A data storage system as claimed in claim 9, wherein dirty data which is destaged to a data storage means and a copy of the data is retained in the cache as clean data is deleted from the first list and added to the second list.
 11. A data storage system as claimed in claim 9, wherein a read command which is a cache miss fetches data from the data storage means and the data is retained in the cache with a reference in the second list.
 12. A data storage system as claimed in claim 9, wherein a flag is provided with each data reference in the first list indicating whether or not the data has been read whilst on the first list.
 13. A data storage system as claimed in claim 9, wherein, if the data was read when referenced in the first list, the data is added to the head of the second list when the data is destaged.
 14. A data storage system as claimed in claim 9, wherein, if the data was not read when referenced in the first list, the data is either maintained in its position in the second list or discarded.
 15. A data storage system as claimed in claim 12, wherein the flag includes a timestamp each time the data is read and the timestamp is used to prioritise the position of the data reference in the second list.
 16. A data storage system as claimed in claim 9, wherein data is partly dirty and partly clean and is referenced in both the first and second lists.
 17. A computer program product stored on a computer readable storage medium, comprising computer readable program code means for retaining data in a data cache by performing the steps of: referencing dirty data stored in a cache in a first least recently used list; and referencing clean data in the cache in a second least recently used list; wherein dirty data is destaged from the cache when it reaches the tail of the first least recently used list and clean data is purged from the cache when it reaches the tail of the second least recently used list.